DOCUMENT RESUME

ED 109 146                                                  TM 004 575

AUTHOR       Vitaliano, Peter Paul
TITLE        Small Sample Comparisons of the Cochran Q and the Minimum X Sub One Squared Statistics.
PUB DATE     [Apr 75]
NOTE         26p.; Paper presented at the Annual Meeting of the American Educational Research Association (Washington, D.C., March 30-April 3, 1975)
EDRS PRICE   MF-$0.76 HC-$1.95 PLUS POSTAGE
DESCRIPTORS  *Comparative Analysis; Correlation; *Hypothesis Testing; Matched Groups; *Nonparametric Statistics; *Statistical Analysis; *Tests of Significance
IDENTIFIERS  *Chi Square; Cochran Q; Correlated Proportions

ABSTRACT
The Cochran Q and the Minimum X sub one squared statistics are two ways to test a hypothesis of equivalent correlated proportions. This study investigated the small sample properties of Q and X sub one squared by Monte Carlo methods. The observed distributions were compared for their rates of convergence to the limiting theoretical X sub one squared distribution, and for the degree to which their error rates approximated the nominal error rates. These latter comparisons allowed for idiosyncrasies that exist in the Q test of the correlated proportion hypothesis. Results show that the X sub one squared statistic is more powerful than the Q statistic for testing hypotheses of equivalent correlated proportions. (Author)






SMALL SAMPLE COMPARISONS OF THE COCHRAN Q
AND THE MINIMUM $X_1^2$ STATISTICS

Peter Paul Vitaliano
Syracuse University

Paper presented at the Annual Meeting of the
American Educational Research Association
Washington, D.C.
April 2, 1975



Introduction 

The purpose of this study was to compare the relative merits, for small samples, of the Cochran Q statistic (Cochran, 1950) and the minimum $X_1^2$ statistic (Neyman, 1949). These two statistics can be meaningfully compared since they not only represent alternative ways of testing hypotheses about the equality of correlated proportions, but they are also both asymptotically distributed as $\chi^2$.

Hypotheses about the equality of correlated proportions arise in situations in which categorical data are obtained either from matched samples or from repeated observations on subjects from one sample. Since it is assumed, in Q, that observations are dichotomously scored, the two category (c = 2) per response situation was considered in this study. Table 1 contains such a situation, where there are i = 1, ..., n matched groups (or subjects, for the repeated measures situation); j = 1, ..., r matched subjects within group i (or r responses per subject i); and the response $x_{ij}$ = 1 or 0 for a success or failure.



Insert Table 1 about here 



In Table 1, the hypothesis of equal correlated proportions is

$$H_0:\ E(T_1/n) = E(T_2/n) = \cdots = E(T_r/n), \tag{1}$$

where $T_j$ is the number of successes in the j-th column. Thus the implication in (1) is that the probability of a success is equal for all r treatments.

Numerous examples of the experimental situation in Table 1 are cited in standard statistical texts (Hays, 1963; Siegel, 1956; Winer, 1971). One may have a need to test a hypothesis, as in (1), in a number of content areas. In psychometrics one may wish to test whether the items on a test differ in difficulty. In social psychology, only one item or question may be of interest; in this situation one may wish to test whether or not a specific group changes its response to this question over time. In clinical psychology, rater reliability may be a concern; one might ask whether the proportion of patients rated positively is identical for different therapists. Finally, in experimental or comparative psychology one may wish to test whether the proportion of positive responses, in a particular species, is constant for different drugs (levels) or different doses of the same drug.

Theoretical Development of the Q and $X_1^2$ Statistics

In developing a test for (1), Cochran assumed that the total $u_i$ for any row i is fixed, and that the expected value of $x_{ij}$, or the probability of a success $p_i$, is constant for all r columns in row i. Thus:

$$E(x_{ij}) = u_i/r, \tag{2}$$

$$V(x_{ij}) = (u_i/r)(1 - u_i/r). \tag{3}$$

With $u_i$ fixed, $E(u_i) = u_i$ and $V(u_i) = 0$ and, since $V(x_{ij})$ is constant for all r cells in row i,

$$V(u_i) = r\,V(x_{ij}) + \sum_{j \neq k} \mathrm{Cov}(x_{ij}, x_{ik}) = 0. \tag{4}$$

Cochran further assumed that $\mathrm{Cov}(x_{ij}, x_{ik})$ is constant for all $j \neq k$; thus, since the sum in (4) contains r(r - 1) terms,

$$\mathrm{Cov}(x_{ij}, x_{ik}) = \frac{-(u_i/r)(1 - u_i/r)}{r - 1}. \tag{5}$$

The results in (2), (3) and (5), along with the assumption that the n rows are independent, allowed Cochran to obtain:

$$E(T_j) = \sum_{i=1}^{n} E(x_{ij}) = \sum_{i=1}^{n} u_i/r = r\bar{T}/r = \bar{T}, \tag{6}$$

$$V(T_j) = \sum_{i=1}^{n} V(x_{ij}) = \sum_{i=1}^{n} (u_i/r)(1 - u_i/r), \tag{7}$$

and

$$\mathrm{Cov}(T_j, T_k) = \frac{-\sum_{i=1}^{n} (u_i/r)(1 - u_i/r)}{r - 1}. \tag{8}$$

Given these results, Cochran defined Q as:

$$Q = \frac{\sum_{j=1}^{r} (T_j - \bar{T})^2}{\dfrac{r}{r-1} \sum_{i=1}^{n} (u_i/r)(1 - u_i/r)}, \tag{9}$$

where the denominator in (9) can be shown to be equivalent to (7) - (8).
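As an illustration of (9), the computation of Q from a table of dichotomous scores can be sketched in Python. This is a minimal sketch and not code from the original study; the function name `cochran_q` and the small data set are hypothetical.

```python
from typing import List


def cochran_q(x: List[List[int]]) -> float:
    """Cochran's Q for an n-by-r table of 0/1 scores (rows are matched groups)."""
    r = len(x[0])
    col_totals = [sum(row[j] for row in x) for j in range(r)]  # T_j
    row_totals = [sum(row) for row in x]                       # u_i
    t_bar = sum(col_totals) / r
    numerator = sum((t - t_bar) ** 2 for t in col_totals)
    # Denominator of (9): (r / (r - 1)) * sum_i (u_i / r)(1 - u_i / r),
    # which equals the common variance (7) minus the common covariance (8).
    denominator = (r / (r - 1)) * sum((u / r) * (1 - u / r) for u in row_totals)
    if denominator == 0:
        raise ValueError("Q is undefined: every row total is 0 or r")
    return numerator / denominator


# Hypothetical scores for n = 6 matched groups and r = 3 treatments.
scores = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [0, 0, 0],
    [1, 0, 0],
    [1, 1, 0],
]
print(cochran_q(scores))  # approximately 6.0
```

For these scores the column totals are 5, 3 and 1, so the numerator is 8 and the denominator is 4/3; the same value follows from the familiar computing form (r - 1)[rΣT_j² - (ΣT_j)²] / (rΣu_i - Σu_i²).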

If one assumes that n is large, the $T_j$ totals may be expected to tend to multivariate normality with common variance (7) and common covariance (8). Given that (1) is true, Cochran cited Walsh (1947) as having proven that if n is large, the ratio in Q will be distributed as $\chi^2$ with r - 1 degrees of freedom. However, if one reviews an assumption made in deriving Q (specifically in passing from (4) to (5)), it becomes obvious that Q does not test (1) alone. Instead, Q simultaneously tests (1) and the hypothesis of equal population covariances, namely:

$$H_0:\ \text{all } r(r-1) \text{ covariances are equal}, \tag{10}$$

or that all values of (8) are equal.

It should be apparent that (1), together with (10), is a much more exacting hypothesis than (1) alone. This is at least one problem the researcher faces when Q is used to test (1). Another problem one faces when using Q can be observed in the denominator of (9). It should be obvious that rows where $u_i$ is either equal to zero or to r do not contribute to the value of Q. Moreover, since Q is insensitive to these rows one can never estimate, a priori, what power the Q test will have, since the effective sample size of Q will be less than n. That is, whenever the probability of obtaining a $u_i$ = 0 or r is not equal to zero, sample size attrition will occur. It shall be shown below that $X_1^2$ is affected by the original sample size.
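The attrition described here can be made concrete with a short Python sketch. It is illustrative only; the function names are hypothetical, and the pattern probabilities are assumed to be listed from the all-success pattern to the all-failure pattern.

```python
from typing import List, Sequence


def effective_sample_size(x: List[List[int]]) -> int:
    """Number of rows that actually contribute to Q (0 < u_i < r)."""
    r = len(x[0])
    return sum(1 for row in x if 0 < sum(row) < r)


def attrition_probability(pattern_probs: Sequence[float]) -> float:
    """P(u_i = 0 or u_i = r) for one row, assuming the first and last
    entries are the all-success and all-failure pattern probabilities."""
    return pattern_probs[0] + pattern_probs[-1]


# Hypothetical data: two of the six rows are trivial (all 1's or all 0's).
scores = [[1, 1, 0], [1, 0, 0], [1, 1, 1], [0, 0, 0], [1, 0, 0], [1, 1, 0]]
print(effective_sample_size(scores))                  # 4
print(attrition_probability([0.3, 0.2, 0.1, 0.4]))    # about 0.7
```

With n = 6 rows observed, only 4 enter the denominator of Q; the expected loss per row is the total probability of the two trivial patterns.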

The minimum $X_1^2$ statistic is an alternative to the Q statistic for testing (1). Neyman (1949) wrote the foundation paper which defined and theoretically developed this statistic. Since then several authors, namely Berkson (1955), Bhapkar (1961, 1965) and Grizzle et al. (1969), have reformulated the statistic so that it could be easily applied in a wide variety of situations.



Grizzle's approach is most appealing for a number of reasons. First, categorical data are expressed in terms of familiar linear models. Second, the parameters of these models are then estimated and hypotheses about them are tested using the well known method of weighted least squares. Finally, because the linear model approach can be applied in so many situations, Grizzle's method represents a unified approach, both conceptually and computationally, for applying the $X_1^2$ statistic to categorical data. Because of these advantages, Grizzle's approach for testing (1) is the one which is cited in this paper.

Grizzle begins his discussion by referring to a general categorical situation; this scheme is depicted in Table 2a, where the rows represent s multinomial populations and the columns represent c categories of a response. Table 2b contains definitions of the notation that is employed in the development of the $X_1^2$ statistic.



Insert Table 2a -b about here 



Given this notation, Grizzle defines a function $f_m(\pi) = a_m \pi$, where $a_m$ is 1 × cs and $\pi$ is cs × 1, and where $f_m(\pi)$ is assumed to be any function of the elements of $\pi$ that has partial derivatives up to the second order with respect to $\pi_{ij}$, and m = 1, 2, ..., u ≤ s(c - 1). For all u functions, one can write the general linear model as:

$$\underset{u \times 1}{F(\pi)} = \underset{u \times cs}{A}\ \underset{cs \times 1}{\pi}, \tag{11}$$

where A is a matrix of desired weights, with rank equal to u, so that one obtains u linearly independent $f(\pi)$'s.

Given (11), S is then defined as the sample estimate of the covariance matrix of F(p): $S = A\,V(p)\,A'$, where S is u × u, A is u × cs, V(p) is cs × cs and A' is cs × u, with the rank of S equal to u. It should be mentioned that if any frequency ($n_{ij}$) is zero, S will be singular; in this case Berkson (1955) has recommended that $n_{ij}$ be replaced by 1/c, so that $p_{ij}$ is $1/(c\,n_{i\cdot})$.

Grizzle assumes further that the $f(\pi)$'s in (11) can be described in terms of a matrix X and a vector of unknown parameters $\beta$:

$$\underset{u \times 1}{F(\pi)} = \underset{u \times cs}{A}\ \underset{cs \times 1}{\pi} = \underset{u \times w}{X}\ \underset{w \times 1}{\beta}, \tag{12}$$

where X is a type of design matrix of rank w ≤ u. Based on (12), one can write the sum of squared deviations, of observed versus expected linear functions, as:

$$(F(p) - X\beta)'\, S^{-1}\, (F(p) - X\beta). \tag{13}$$

Given (13), Grizzle then cites Neyman (1949, Theorem 4) as having proven that if an expression as in (13) is minimized with respect to $\beta$, the result:

$$b = (X'S^{-1}X)^{-1}X'S^{-1}F(p)$$

will contain B.A.N. estimates of the parameters in $\beta$, where B.A.N. estimates are asymptotically normal, efficient and consistent. Furthermore, the minimum value of (13) that is obtained will, according to Bhapkar (1961, Theorem 3), be the minimum $X_1^2$ test statistic for the fit of the model in (12), namely:

$$SS(\text{due to } H_0: F(\pi) = X\beta) = F(p)'S^{-1}F(p) - b'(X'S^{-1}X)b. \tag{14}$$

Finally, since (14) is the minimum $X_1^2$ statistic it will, according to Neyman (1949, Lemma 12), be asymptotically distributed as $\chi^2$ with u - w degrees of freedom, if (12) is true.

Once a model is defined and tested for adequate fit to the data (as in (14)), a test of a general linear hypothesis:

$$H_0:\ \underset{d \times w}{C}\ \underset{w \times 1}{\beta} = \underset{d \times 1}{0} \tag{15}$$

can be obtained by the same weighted least squares method. In (15), C (d × w) is of full row rank d ≤ w. Thus (15) represents restrictions on the original parameters in (12). Using the same rationale as in (13) and (14), one obtains a minimum $X_1^2$ test statistic for (15) as:

$$SS(C\beta = 0) = b'C'\left(C(X'S^{-1}X)^{-1}C'\right)^{-1}Cb,$$

with d degrees of freedom, if (15) is true.

In many cases there is only one population, as in Table 1, and the objective of the analysis is to study the relationships among ways of classification of the sample units. Many situations of this kind can be described by the model:

$$\underset{u \times 1}{F(\pi)} = 0. \tag{16}$$

This fits into Grizzle's general model (12) by setting X = 0, the null matrix. The test statistic for (16) is then:

$$SS(F(\pi) = 0) = F(p)'S^{-1}F(p), \tag{17}$$

which is asymptotically $\chi^2$ with u degrees of freedom, if (16) is true.

Since the situation in Table 1 is a one population problem, the expression in (17) can be used as the minimum $X_1^2$ test for the hypothesis in (1). This can be presented through the use of an example. Table 3 contains a reproduction of the repeated measures data presented in Grizzle's Table 4 (1969). Forty-six subjects were each given drugs A, B, and C. Some had a favorable response to one, some to two and some to all three drugs.

Insert Table 3 about here 

Before the $X_1^2$ method for handling this situation is discussed one should note the differences between the situations in Tables 2a and 3: the former table contains s populations, 1 response per subject and c categories per response, while the latter table contains 1 population, r responses per subject and 2 categories per response. However, although this latter situation has two categories it is still multinomial if one regards the possible response patterns for any one subject as the experiment's mutually exclusive categories. That is, for each of the three responses in Table 3 there are two possible categories; therefore, there are a total of $c^r$ (= $2^3$ = 8) possible response patterns. Since any one subject can have one and only one response pattern, his response vector contains eight elements, one of which is equal to one and seven of which are equal to zero.
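The recoding of r dichotomous responses into one multinomial category can be sketched as follows. This is an illustration in Python, not part of the original analysis; the function names are hypothetical, and the ordering of the patterns (all successes first) is an assumption.

```python
from itertools import product
from typing import List, Sequence, Tuple


def response_patterns(r: int) -> List[Tuple[int, ...]]:
    """All 2**r patterns of r dichotomous responses, all-success pattern first."""
    return list(product((1, 0), repeat=r))


def indicator_vector(responses: Sequence[int],
                     patterns: List[Tuple[int, ...]]) -> List[int]:
    """Response vector with a single 1 marking the subject's pattern."""
    return [1 if tuple(responses) == p else 0 for p in patterns]


patterns = response_patterns(3)
print(len(patterns))                          # 8
print(patterns[:4])                           # [(1, 1, 1), (1, 1, 0), (1, 0, 1), (1, 0, 0)]
print(indicator_vector((1, 0, 1), patterns))  # [0, 0, 1, 0, 0, 0, 0, 0]
```

Summing the indicator vectors over subjects gives the eight pattern frequencies on which the multinomial formulation operates.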

Given this configuration the hypothesis to be tested in Table 3 is that the marginal proportions are equal:

$$H_0:\ E(T_1/N) = E(T_2/N) = E(T_3/N), \tag{18}$$

where (18) is identical to (1), for three responses.

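This hypothesis may be written so that it is amenable to Grizzle's approach. Given Table 3, one can see that (18) implies the hypothesis:

$$\pi_1 + \pi_2 + \pi_3 + \pi_4 = \pi_1 + \pi_2 + \pi_5 + \pi_6 = \pi_1 + \pi_3 + \pi_5 + \pi_7.$$

Yet this hypothesis can be rewritten as:

$$\pi_2 + \pi_4 - \pi_5 - \pi_7 = 0, \qquad \pi_2 - \pi_3 + \pi_6 - \pi_7 = 0. \tag{19}$$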

This hypothesis can be readily adapted to that in (16) by choosing A such that

$$A = \begin{bmatrix} 0 & 1 & 0 & 1 & -1 & 0 & -1 & 0 \\ 0 & 1 & -1 & 0 & 0 & 1 & -1 & 0 \end{bmatrix},$$

and (19) then becomes

$$H_0:\ F(\pi) = A\pi = 0. \tag{20}$$

From (20), the estimated covariance matrix of F(p) is $A\,V(p)\,A'$, and the test of the fit of the model in (20) is given by:

$$X_1^2 = p'A'\left(A\,V(p)\,A'\right)^{-1}Ap, \tag{21}$$

where if (20) holds, (21) is asymptotically distributed as $\chi^2$ with two degrees of freedom. Table 4 provides the computations which are necessary for obtaining Q (in (9)) and $X_1^2$ (in (21)) for those data in Table 3.
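A minimal Python sketch of the statistic in (21) for the three response case follows. It is an illustration only: the function name, the pattern frequencies and the assumed ordering of the eight patterns ((111), (110), (101), (100), (011), (010), (001), (000)) are assumptions and are not the data of Table 3. V(p) is taken to be the usual multinomial estimate, (diag(p) - pp')/N.

```python
from typing import List, Sequence

# Contrast matrix for equal marginal proportions under the assumed ordering:
# row 1 is pi_2 + pi_4 - pi_5 - pi_7, row 2 is pi_2 - pi_3 + pi_6 - pi_7.
A: List[List[int]] = [
    [0, 1, 0, 1, -1, 0, -1, 0],
    [0, 1, -1, 0, 0, 1, -1, 0],
]


def minimum_x2_marginal(counts: Sequence[int], a: List[List[int]] = A) -> float:
    """p'A'(A V(p) A')^{-1} A p for a two-row contrast matrix."""
    n = sum(counts)
    p = [k / n for k in counts]
    m = range(len(p))
    f = [sum(row[i] * p[i] for i in m) for row in a]           # F(p) = A p
    # S = A V(p) A' with V(p) = (diag(p) - p p') / n
    s = [[(sum(rk[i] * rl[i] * p[i] for i in m) - f[k] * f[l]) / n
          for l, rl in enumerate(a)] for k, rk in enumerate(a)]
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    if det == 0:
        raise ValueError("S is singular for these frequencies")
    # Quadratic form F' S^{-1} F written out for the 2 x 2 case.
    return (f[0] ** 2 * s[1][1] - 2 * f[0] * f[1] * s[0][1]
            + f[1] ** 2 * s[0][0]) / det


hypothetical_counts = [5, 9, 3, 4, 2, 6, 4, 7]   # N = 40, illustrative only
print(minimum_x2_marginal(hypothetical_counts))
```

When all eight frequencies are equal the three marginal proportions coincide and the statistic is zero, which gives a quick check on the contrast matrix.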

Insert Table 4 about here



Given the theoretical development of the Q and $X_1^2$ statistics, it should be clear that while Q tests (1) and (10) simultaneously, $X_1^2$ only tests (1). Now that the foundations of these statistics have been presented, the methodology employed in this study shall be discussed.

Methodology 

The properties of the small sample distributions of Q and $X_1^2$ were investigated (for r = 2) through the use of enumeration methods. Once a specific parent population was defined, these two statistics were calculated on all possible samples from this population so that exact distributions of Q and $X_1^2$ could be formed.
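The enumeration idea for r = 2 can be sketched in Python as follows. This is a hedged illustration, not the program used in the study. It assumes that Q and $X_1^2$ take the closed forms $(n_2 - n_3)^2/(n_2 + n_3)$ and $(n_2 - n_3)^2/[(n_2 + n_3) - (n_2 - n_3)^2/N]$ in terms of the frequencies $n_2$ and $n_3$ of the two nontrivial patterns, that a sample with $n_2 = n_3$ never rejects, and that a zero denominator in $X_1^2$ counts as a rejection; the handling of these boundary samples in the original study is not known.

```python
from math import comb
from typing import Tuple

CHI2_05_1 = 3.841458820694124   # upper 5% point of chi-square, 1 df
CHI2_01_1 = 6.634896601021214   # upper 1% point of chi-square, 1 df


def q_stat(n2: int, n3: int) -> float:
    return 0.0 if n2 + n3 == 0 else (n2 - n3) ** 2 / (n2 + n3)


def x2_stat(n2: int, n3: int, n: int) -> float:
    if n2 == n3:
        return 0.0
    denom = (n2 + n3) - (n2 - n3) ** 2 / n
    return float("inf") if denom <= 0 else (n2 - n3) ** 2 / denom


def exact_rejection_rates(n: int, pi2: float, pi3: float,
                          crit: float) -> Tuple[float, float]:
    """Exact P(Q >= crit) and P(X2 >= crit) over all samples of size n.

    pi2 and pi3 are the population probabilities of the two nontrivial
    patterns; the two trivial patterns are pooled, since neither statistic
    depends on how the remaining subjects split between them.
    """
    rest = 1.0 - pi2 - pi3
    p_q = p_x = 0.0
    for n2 in range(n + 1):
        for n3 in range(n - n2 + 1):
            n0 = n - n2 - n3
            prob = (comb(n, n2) * comb(n - n2, n3)
                    * pi2 ** n2 * pi3 ** n3 * rest ** n0)
            if q_stat(n2, n3) >= crit:
                p_q += prob
            if x2_stat(n2, n3, n) >= crit:
                p_x += prob
    return p_q, p_x


# Hypothetical population in which (1) is true (pi2 = pi3), N = 20.
print(exact_rejection_rates(20, 0.2, 0.2, CHI2_05_1))
```

Setting pi2 equal to pi3 gives exact type I error rates; unequal values give exact powers. Raising the pooled probability of the trivial patterns mimics sample size attrition.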

Given the hypotheses that Q and $X_1^2$ test, there are four possible combinations of (1) true or false and (10) true or false. These combinations define four distinct population types; they are presented in Table 5. One should notice that the number of responses is limited in certain population types: if r = 2 then (10) cannot be false and so population types B and C only contain r > 2 responses. Also, if r > 2, (1) cannot be simultaneously false while (10) is true; therefore, type D populations only contain two responses.



Insert Table 5 about here



The behavior of Q and $X_1^2$ in three response populations is currently being investigated. A decision has not yet been reached as to whether or not enumeration will be possible in the three response case. Patil (1975) has provided a method for enumerating Q for any r; however, it is not yet clear whether $X_1^2$ can be as easily enumerated in the r = 3 situation. If enumeration is found to be intractable, the approximate distribution of $X_1^2$ will be simulated through the use of Monte Carlo methods.



It should be mentioned that Tate and Brown (1964, 1970) also studied the effect of differing sample sizes on the distribution of Q. They enumerated the exact distribution of Q for selected values of r and n. However, since these authors studied Q alone, and did not relate it to any other statistics, they did not find it necessary to generate distributions from populations in which sample size deletion occurs (where the probabilities of trivial patterns are nonzero). Furthermore, in studying Q in isolation, Tate and Brown only had to enumerate distributions of Q in type A populations. In the present study the distributions of Q and $X_1^2$ were generated from type A and D populations. Moreover, these populations were predetermined so that the probability in the trivial patterns varied. In this way, the effect that sample size attrition has on the discrepancies in the exact powers of Q and $X_1^2$, for any sample size, could be studied.

Results and Discussion

Before any results are reported, an important relationship between Q and $X_1^2$ (for r = 2) should be mentioned; once this relationship was established there was no doubt that the small sample power of $X_1^2$ would be greater than the small sample power of Q (at least for r = 2).

The $X_1^2$ statistic will now be shown to be monotonically related to Q. In the two response case there are M = $c^r$ = $2^2$ = 4 possible patterns. If these patterns are denoted as (11), (10), (01), (00) and their respective frequencies as $n_1$, $n_2$, $n_3$, $n_4$, then it can be shown that for all nontrivial cases ($n_2 \neq n_3$), $X_1^2$ will always be greater than Q.



Given the above pattern frequencies, it is well known that Q (here identical to the McNemar statistic; McNemar, 1949) can be expressed as:

$$Q = \frac{(n_2 - n_3)^2}{n_2 + n_3}. \tag{22}$$

Furthermore, if one manipulates the expression in (21), it can be shown that for r = 2, (21) can be expressed as:

$$X_1^2 = \frac{(n_2 - n_3)^2}{(n_2 + n_3) - (n_2 - n_3)^2/N}. \tag{23}$$

Given (22) and (23) one can show that:

$$\frac{1}{X_1^2} = \frac{1}{Q} - \frac{1}{N} \qquad \text{or} \qquad X_1^2 = \frac{1}{1/Q - 1/N}. \tag{24}$$



From (24) one can observe that (for all cases where $n_2 \neq n_3$) $X_1^2$ will always exceed Q and will therefore reject (1) more often than Q. The results below are consistent with this finding.
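The reciprocal relation between the two statistics can be checked with exact rational arithmetic. The Python sketch below is illustrative only; it assumes the closed forms $Q = (n_2 - n_3)^2/(n_2 + n_3)$ and $X_1^2 = (n_2 - n_3)^2/[(n_2 + n_3) - (n_2 - n_3)^2/N]$, and the function names are hypothetical.

```python
from fractions import Fraction


def q_stat(n2: int, n3: int) -> Fraction:
    return Fraction((n2 - n3) ** 2, n2 + n3)


def x2_stat(n2: int, n3: int, n: int) -> Fraction:
    return Fraction((n2 - n3) ** 2) / (Fraction(n2 + n3) - Fraction((n2 - n3) ** 2, n))


def check_relation(n: int) -> bool:
    """True if 1/X2 = 1/Q - 1/N and X2 > Q for every nontrivial sample of size n."""
    for n2 in range(n + 1):
        for n3 in range(n - n2 + 1):
            if n2 == n3:
                continue  # trivial case: both statistics are zero or undefined
            if n2 + n3 == n and (n2 == 0 or n3 == 0):
                continue  # denominator of X2 is zero for this sample
            q, x = q_stat(n2, n3), x2_stat(n2, n3, n)
            if 1 / x != 1 / q - Fraction(1, n) or x <= q:
                return False
    return True


print(check_relation(20))  # True
```

Because Q never exceeds $n_2 + n_3 \leq N$, the difference 1/Q - 1/N is positive whenever the denominator is nonzero, so $X_1^2$ is an increasing function of Q and the gap between the two shrinks as N grows.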

The exact α levels of Q and $X_1^2$

Table 6 contains $P(Q \geq \chi^2_{.01,1})$, $P(X_1^2 \geq \chi^2_{.01,1})$, $P(Q \geq \chi^2_{.05,1})$ and $P(X_1^2 \geq \chi^2_{.05,1})$. These probabilities were obtained from exact sampling distributions based on three type A populations. A total of nine sampling distributions were generated from these populations; these distributions are ordered according to the actual sample size used in the calculation of Q (from the smallest to the largest N).

Insert Table 6 about here 

In all eighteen comparisons of Q and $X_1^2$, the exact probability of Q is closer to the asymptotic nominal probability (based on the $\chi^2$ distribution) than is the exact probability of $X_1^2$. Q is conservative in fifteen of the eighteen comparisons (in all nine at α = .01), while $X_1^2$ is liberal in all eighteen comparisons. As N increases, the upper tails of both the Q and $X_1^2$ distributions converge to the upper tail of the $\chi^2$ distribution, although Q seems to converge more rapidly. Moreover, as expected from (24), as N increases, the exact distributions of Q and $X_1^2$ approach one another.

Based on the results in Table 6 one can conclude that, for r = 2, the sample sizes considered here, and (1) and (10) true, the distribution of Q comes closer to the $\chi^2$ distribution than does the distribution of $X_1^2$; thus, in these situations, Q can be said to provide a better test than $X_1^2$. In general the discrepancies of these two statistics are not as great for α = .05 as for α = .01.

The exact power of Q and $X_1^2$

Table 7 contains thirty-two exact distributions which were enumerated from twenty-five different type D populations. The distributions are ordered according to the magnitude (from smallest to largest) of the noncentrality parameter (λ) in their respective parent populations. The estimated asymptotic power in any population is also given; this was obtained by referring the appropriate λ, α and degrees of freedom to tables of the noncentral $\chi^2$ distribution (Owen, 1962). The estimated asymptotic power in a parent population serves as a reference point for comparing the exact powers of Q and $X_1^2$; that is, if the asymptotic power in a parent population were very high (say .95) it would be difficult to detect differences in the exact powers of Q and $X_1^2$, since these two statistics would both be quite powerful. It was arbitrary as to whether the approximate asymptotic power of Q or $X_1^2$ would be used as the reference point. It was decided that the asymptotic power of $X_1^2$ be used and so the λ for a particular population was obtained by substituting, into (23), the values of $N\pi_2$ and $N\pi_3$ in that population.
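In place of printed tables, the asymptotic power for one degree of freedom can be approximated directly. The Python sketch below is a stand-in for the table lookup, not the procedure of the original study. It assumes that (23) has the form $(n_2 - n_3)^2/[(n_2 + n_3) - (n_2 - n_3)^2/N]$, and it uses the fact that a noncentral $\chi^2$ variable with one degree of freedom and noncentrality λ is distributed as $(Z + \sqrt{\lambda})^2$ for a standard normal Z. The function names and population values are hypothetical.

```python
from math import erfc, sqrt

CHI2_05_1 = 3.841458820694124   # upper 5% point of chi-square, 1 df


def normal_upper_tail(z: float) -> float:
    """P(Z >= z) for a standard normal Z."""
    return 0.5 * erfc(z / sqrt(2.0))


def noncentrality(n: int, pi2: float, pi3: float) -> float:
    """Lambda obtained by substituting N*pi2 and N*pi3 for n2 and n3."""
    e2, e3 = n * pi2, n * pi3
    return (e2 - e3) ** 2 / ((e2 + e3) - (e2 - e3) ** 2 / n)


def asymptotic_power_1df(lam: float, crit: float) -> float:
    """P(noncentral chi-square with 1 df and noncentrality lam >= crit)."""
    root_c, root_l = sqrt(crit), sqrt(lam)
    return normal_upper_tail(root_c - root_l) + normal_upper_tail(root_c + root_l)


lam = noncentrality(20, 0.3, 0.1)          # hypothetical type D population
print(lam)                                 # about 2.22
print(asymptotic_power_1df(lam, CHI2_05_1))
```

With λ = 0 the function returns the nominal α, which is a convenient check on the critical value.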



Insert Table 7 about here

As expected from (24), $X_1^2$ is consistently more powerful than Q; in all sixty-four comparisons, $P(X_1^2 \geq \chi^2_\alpha)$ is greater than $P(Q \geq \chi^2_\alpha)$. One might point out a fallacy in this type of comparison. That is, one might argue that while Q is less powerful than $X_1^2$, it is also a much more conservative test. In this sense, it would appear to be unfair to compare the exact powers of Q and $X_1^2$ by comparing $P(Q \geq \chi^2_\alpha)$ to $P(X_1^2 \geq \chi^2_\alpha)$. If, on the other hand, one were to find the values of Q = K and $X_1^2$ = L such that α = $P(Q \geq K)$ = $P(X_1^2 \geq L)$, then by definition, the exact powers of Q and $X_1^2$ would also be identical. The problem with this argument is that the researcher does not know the exact distribution of either Q or $X_1^2$ (he cannot obtain K or L), and in using the $\chi^2$ approximation, he assumes that $P(Q \geq \chi^2_\alpha) = P(X_1^2 \geq \chi^2_\alpha) = \alpha$. Given this assumed constant value of $\chi^2_\alpha$, (1) will be rejected more often with $X_1^2$ than with Q, and in this sense it is legitimate to say that $X_1^2$ is more powerful than Q. The logical question is then: in which situations would one statistic be preferred over the other?

The trivial response is that Q is to be preferred when type I errors are of major concern, while $X_1^2$ is to be preferred when type II errors are of major concern. However, a more specific set of recommendations can be offered if one is willing to make a concession.

Suppose a researcher has a sample size of N = 20 or N = 30; if he were to use the $X_1^2$ statistic to test a hypothesis, the exact type I error rate, at asymptotic α = .01, would only be partially controlled if N were 20 (in Table 6 the average error rate for N = 20 is .045); however, if N were 30, the exact error rate would be controlled adequately (the average at N = 30 is .0185). Given this scheme one might ask, if these exact error rates were tolerated in $X_1^2$, under what conditions would the exact powers of $X_1^2$ and Q be most discrepant? If this question could be answered, then some valuable recommendations could be offered to researchers who were willing to tolerate the above type I error rates. The results in Table 8 provide a partial answer to this question.



Insert Table 8 about here



Table 8 represents a re-examination of the results obtained for the twenty exact distributions (in Table 7) which were generated from samples of N = 20 or N = 30. The rows and columns in this table respectively represent the asymptotic powers and percent deletions in the parent populations of each of these twenty distributions. The cells in the body of the table contain the following information: the label of the distribution, as in Table 7; in parentheses, the ratio of the difference in the exact powers of $X_1^2$ and Q divided by the exact power of $X_1^2$ (hereafter this ratio will be referred to as RDP); and the sample size used to obtain the distribution. It should be mentioned that RDP is less affected by the asymptotic power in a parent population than is a mere difference in the exact powers and, in this sense, relative differences are more informative than are absolute differences.
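For concreteness, the RDP ratio can be written as a small Python function. This is a sketch under the assumption that the denominator is the exact power of $X_1^2$; the function name and the power values are hypothetical and are not taken from Table 8.

```python
def relative_difference_in_power(power_x2: float, power_q: float) -> float:
    """RDP: (exact power of X2 - exact power of Q) / exact power of X2."""
    if power_x2 <= 0:
        raise ValueError("the exact power of X2 must be positive")
    return (power_x2 - power_q) / power_x2


# Same absolute difference of .30, very different relative differences.
print(relative_difference_in_power(0.40, 0.10))  # about 0.75
print(relative_difference_in_power(0.95, 0.65))  # about 0.32
```

The two calls share an absolute difference of .30, yet the ratio is far larger when both powers are low, which is why a relative measure is less tied to the asymptotic power of the parent population.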

If one observes the RDP's in Table 8 it should be evident that while $X_1^2$ is always recommended over Q (in terms of having greater power), the use of $X_1^2$ is almost a necessity in certain situations. As RDP approaches zero, the exact powers of $X_1^2$ and Q approach one another. Given this, it should be clear that under certain circumstances the use of $X_1^2$ is highly recommended over the use of Q: very large RDP's occur when asymptotic power is low and the percent deletions is large (populations 1-3, 7 and 9); however, if asymptotic power is great, even large deletions do not appreciably affect the RDP (for example, population 29).

Given a range of low asymptotic power (all populations whose noncentrality parameters are less than 5), it is clear that as the percent deletion increases, RDP increases dramatically. In fact, populations 8 and 12 have more than three times the asymptotic power of populations 1 and 2, yet in spite of these differences, the larger percentage deletion in 1 and 2 produces RDP's which are much larger than in 8 and 12. One very telling result is in the difference in the RDP's of 8 and 9. These populations have identical asymptotic power, yet 9 has more than twice the number of deletions as 8; simultaneously, the RDP in 9 is twice as large as in 8.

Based on the results in Table 8, one can make this recommendation: if a researcher is willing to tolerate the exact α levels of $X_1^2$ when N = 20 or N = 30 (the latter level is certainly tolerable) then $X_1^2$ should always be used in place of Q; however, if the researcher obtains sample frequencies where the nontrivial frequencies are simultaneously close in value and small, say $|n_2 - n_3| < 3$, and the percent deletions is greater than .8, then the exact power of $X_1^2$ is so much greater than the exact power of Q that one should certainly use the $X_1^2$ statistic.



REFERENCES

Berkson, J. Maximum likelihood and minimum $\chi^2$ estimates of the logistic function. Journal of the American Statistical Association, 1955, 50, 130-162.

Bhapkar, V. P. Some tests for categorical data. Annals of Mathematical Statistics, 1961, 32, 72-83.

Bhapkar, V. P. Categorical data analogs of some multivariate tests. S. N. Roy Memorial Volume, 1965, University of North Carolina Press, Chapel Hill, NC.

Cochran, W. G. The comparison of percentages in matched samples. Biometrika, 1950, 37, 256-66.

Grizzle, J. E., Starmer, C. F., and Koch, G. G. Analysis of categorical data by linear models. Biometrics, 1969, 25, 489-504.

Hays, W. L. Statistics for Psychologists. New York: Holt, Rinehart and Winston, Inc., 1963.

McNemar, Q. Psychological Statistics. New York: John Wiley, 1949.

Neyman, J. Contribution to the theory of the $\chi^2$ test. Proceedings of the Berkeley Symposium on Mathematical Statistics and Probability, University of California Press, 1949, 239-273.

Owen, D. B. Handbook of statistical tables. Reading, Mass.: Addison-Wesley, 1962.

Patil, K. Cochran's Q test: Exact distribution. Journal of the American Statistical Association, 1975, in press.

Siegel, S. Nonparametric statistics for the behavioral sciences. New York: McGraw-Hill, 1956.

Tate, M. W., and Brown, S. M. Tables for comparing related sample percentages and for the median test. Philadelphia: Graduate School of Education, University of Pennsylvania, 1964.

Tate, M. W., and Brown, S. M. Note on the Cochran Q test. Journal of the American Statistical Association, 1970, 65, 155-160.

Walsh, J. E. Concerning the effect of intraclass correlation on certain significance tests. Annals of Mathematical Statistics, 1947, 18, 88-

Winer, B. J. Statistical principles in experimental design. New York: McGraw-Hill, 1971.




Table 1: A Matched Group (or Repeated Measures) Categorical Situation

MATCHED GROUPS      MATCHED SUBJECTS (OR REPEATED MEASURES FOR ONE SUBJECT)
(OR SUBJECTS)       j = 1      2        ...      r        TOTAL
i = 1               x_11       x_12     ...      x_1r     u_1
    2               x_21       x_22     ...      x_2r     u_2
    .               .          .                 .        .
    .               .          .                 .        .
    n               x_n1       x_n2     ...      x_nr     u_n
TOTAL               T_1        T_2      ...      T_r



Table 2a: General Categorical Situation for s Multinomial Populations

POPULATIONS         CATEGORIES OF RESPONSE
                    j = 1      2        ...      c        TOTAL
i = 1               π_11       π_12     ...      π_1c     1
    2               π_21       π_22     ...      π_2c     1
    .               .          .                 .        .
    .               .          .                 .        .
    s               π_s1       π_s2     ...      π_sc     1



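Table 2b: The Notation used in the Development of the $X_1^2$ Statistic

$\pi_i' = (\pi_{i1}, \pi_{i2}, \ldots, \pi_{ic})$, of order 1 × c

$\pi' = (\pi_1', \pi_2', \ldots, \pi_s')$, of order 1 × cs

$p_i' = (p_{i1}, p_{i2}, \ldots, p_{ic})$, of order 1 × c, where $p_{ij} = n_{ij}/n_{i\cdot}$ and $n_{i\cdot}$ = i-th sample size

Variance of $\pi_i$, of order c × c:

$$V(\pi_i) = \frac{1}{n_{i\cdot}} \begin{bmatrix} \pi_{i1}(1-\pi_{i1}) & -\pi_{i1}\pi_{i2} & \cdots & -\pi_{i1}\pi_{ic} \\ -\pi_{i2}\pi_{i1} & \pi_{i2}(1-\pi_{i2}) & \cdots & -\pi_{i2}\pi_{ic} \\ \vdots & \vdots & & \vdots \\ -\pi_{ic}\pi_{i1} & -\pi_{ic}\pi_{i2} & \cdots & \pi_{ic}(1-\pi_{ic}) \end{bmatrix}$$

$V(p_i) = V(\pi_i)$, with $\pi_{ij}$ replaced by $p_{ij}$, of order c × c

$V(p)$ = block diagonal matrix of the $V(p_i)$, of order cs × cs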



Table 3: Repeated measures data from Grizzle et al. (1969), tabulated by every possible response pattern (table body illegible in the source scan)



Table 5: The Four Possible Population Types from which Distributions of Q and $X_1^2$ can be generated

Type A Populations: (1) TRUE, (10) TRUE
$H_0$ is true for Q and $X_1^2$; for r = 2 and r = 3.
1. Sampling distributions of Q and $X_1^2$ were both compared to the $\chi^2$ distribution and to each other. Specific interest was in the exact versus the asymptotic α level of each test and the effect of sample size on the adequacy of the $\chi^2$ approximation to the upper portions of the Q and $X_1^2$ distributions.
2. The effect of sample size attrition was studied in Q along with the effects of sample deletion on differences in the exact type I error rates of Q and $X_1^2$.

Type D Populations: (1) FALSE, (10) TRUE
$H_0$ is false for Q and $X_1^2$; for r = 2.
1. The exact powers of Q and $X_1^2$ were studied in relation to asymptotic power, sample size and sample size deletion.
2. Specific interest was in the degree to which discrepancies in the exact power of Q and $X_1^2$ are affected by sample size deletion, sample size and asymptotic power.

Type C Populations: (1) TRUE, (10) FALSE
$H_0$ is true for $X_1^2$ and false for Q.
The following phenomena are currently under investigation:
1. The effect, on the distribution of Q, of the departure of (10) from equal covariances.
2. The effect of sample size and sample size attrition on the exact power of Q.
3. The exact α levels in the distributions of the $X_1^2$ statistic.

Type B Populations: (1) FALSE, (10) FALSE
$H_0$ is false for Q and $X_1^2$; r = 3.
The same comparisons, as in Type D populations, are currently under investigation.


[Table: the numeric entries are illegible in the source scan. Recoverable column headings: "Original Sample Size" and "Sample Size Used in" (remainder of heading illegible).]






[Table: the numeric entries are illegible in the source scan. Recoverable headings: "Asympt. Power", "Non-Central" (remainder of heading illegible), and "Pattern Probabilities".]






